Abstract

We describe an alternative method (to compression) that combines several
theoretical and experimental results to numerically approximate the
algorithmic (Kolmogorov-Chaitin) complexity of all $\sum_{n=1}^{8} 2^n$ bit
strings up to 8 bits long, and for some between 9 and 16 bits long. This is
done by an exhaustive execution of all deterministic 2-symbol Turing machines
with up to 4 states, for which the halting times are known thanks to the Busy
Beaver problem, that is, 11 019 960 576 machines. An output frequency
distribution is then computed, from which the algorithmic probability is
calculated and the algorithmic complexity evaluated by way of the
(Levin-Zvonkin-Chaitin) coding theorem.

Keywords: algorithmic probability, algorithmic (program-size) complexity,
halting probability, Chaitin's Omega, Levin's Universal Distribution,
Levin-Zvonkin-Chaitin coding theorem, Busy Beaver problem,
Kolmogorov-Chaitin complexity.



1. Overview 

The most common approach to calculating the algorithmic complexity of a
string is the use of compression algorithms that exploit the regularities of
the string to produce shorter compressed versions. The length of a compressed
version of a string s is an upper bound on the algorithmic complexity of s.

In practice, it is a known problem that one cannot compress short strings,
shorter, for example, than the length in bits of the compression program
which is added to the compressed version of s, making the result (the program
producing s) sensitive to the compressor choice and the parameters involved.
However, short strings are quite often the kind of data encountered in many
practical settings. While compressors' asymptotic behavior guarantees the
eventual convergence to the algorithmic complexity of s, thanks to the
invariance theorem (stated later), measurements differ considerably in the
domain of short strings. A few attempts to deal with this problem have been
reported before [21]. The conclusion is that estimators are always challenged
by short strings.

Attempts to compute the uncomputable are always challenging, and often
require combining theoretical and experimental results. In this paper we
describe a method to compute the algorithmic complexity (hereafter denoted by
C(s)) of (short) bit strings by running a (relatively) large set of Turing
machines for which the halting runtimes are known thanks to the Busy Beaver
problem.

In the spirit of the experimental paradigm suggested in [23], the method in
this paper describes a way to find the shortest program given a standard
formalism of Turing machines: all machines are executed one by one, from the
shortest (in number of states) up to a certain (small) size, recording how
many of them produce each string, and a theoretical result is then used to
link this string frequency with the algorithmic complexity of the string.

The result is a novel approach that we put forward for numerically
calculating the complexity of short strings, as an alternative to the
indirect method using compression algorithms. The procedure makes use of a
combination of results from related areas of computation, such as the concept
of halting probability, the Busy Beaver problem [18], algorithmic
probability, Levin's semi-measure and the Levin-Zvonkin-Chaitin coding
theorem (from now on, the coding theorem) [11], [12].



The approach, never attempted before to the authors' knowledge, consists 
in the thorough execution of all 2-symbol Turing machines up to 4 states 
(the exact model is described in [3]) which, upon halting, generate a set of 
output strings from which a frequency distribution is calculated to obtain the 
algorithmic probability of a string. The algorithmic complexity of a string 
can then be evaluated from the algorithmic probability using the coding 
theorem. 

The paper is structured as follows. Section 2 introduces the various
theoretical concepts and experimental results utilized in the experiment,
providing essential definitions and referring the reader to the relevant
papers and textbooks. Section 3 introduces the definition of our empirical
probability distribution D. In Section 4 we present the methodology for
calculating D. In Section 5 we calculate D and provide numerical values of
the algorithmic complexity for short strings by way of the theory presented
in Section 2, particularly the coding theorem. Finally, in Section 7 we
summarize, discuss possible applications, and suggest potential directions
for further research.



2. Preliminaries 

2.1. The Halting problem and Chaitin's Ω

As is widely known, the Halting problem for Turing machines is the problem
of deciding whether an arbitrary Turing machine T eventually halts on an
arbitrary input s. Halting computations can be recognized by simply running
them for the time they take to halt. The problem is to detect non-halting
programs, about which one cannot know whether the program will run forever or
will eventually halt. An elegant and concise representation of the halting
problem is Chaitin's irrational number Ω, defined as the halting probability
of a universal computer programmed by coin tossing. Formally,

Definition 1. $0 < \Omega = \sum_{p \text{ halts}} 2^{-|p|} < 1$, with $|p|$
the size of p in bits.

Ω is the halting probability of a universal (prefix-free^1) Turing machine
running a random program (a sequence of fair coin flip bits taken as a
program).

For an Ω number one cannot compute more than a finite number of digits. The
numerical value of Ω = Ω_U depends on the choice of universal Turing machine
U. There are, for example, Ω numbers for which no digit can be computed.

Knowing the first n bits of an Ω allows one to determine whether a program of
length ≤ n bits halts, by simply running all programs in parallel until the
accumulated sum of the weights of the halted programs exceeds the value given
by those n bits. All programs of length ≤ n that have not halted by then will
never halt. Using this kind of argument, Calude and Stay have shown that most
programs either stop "quickly" or never halt, because the halting runtime
(and therefore the length of the output upon halting) is ultimately bounded
by its program-size complexity. The results herein connect theory with
experiments by providing empirical values of halting times and string length
frequencies.

^1 A set of programs A is prefix-free if there are no two programs p_1 and
p_2 such that p_2 is a proper extension of p_1. Kraft's inequality guarantees
that for any prefix-free set A, $\sum_{x \in A} 2^{-|x|} \le 1$.

2.2. Algorithmic (prefix-free) complexity

The algorithmic complexity $C_U(s)$ of a string s with respect to a universal
Turing machine U, measured in bits, is defined as the length in bits of the
shortest (prefix-free) program p that, run on U, produces the string s and
halts. Formally,

Definition 2. $C_U(s) = \min\{|p| : U(p) = s\}$, where $|p|$ is the length of
p measured in bits.

This complexity measure clearly seems to depend on U, and one may ask whether
there exists a Turing machine which yields different values of C(s). The
answer is that there is no such Turing machine which can be used to decide
whether a short description of a string is the shortest (for formal proofs
see the textbooks cited).

The ability of universal machines to efficiently simulate each other implies
a corresponding degree of robustness. The invariance theorem [19] states that
if $C_U(s)$ and $C_{U'}(s)$ are the lengths of the shortest programs
generating s using the universal Turing machines U and U' respectively, their
difference will be bounded by an additive constant independent of s.
Formally:

Theorem (invariance [19]) 1. $|C_U(s) - C_{U'}(s)| < c_{U,U'}$

A major drawback of C, as a function taking s to the length of the shortest
program producing s, is its noncomputability, proven by reduction to the
halting problem. In other words, there is no program which takes a string s
as input and produces the integer C(s) as output.

2.3. Algorithmic probability

Deeply connected to Chaitin's halting probability Ω is Solomonoff's concept
of algorithmic probability, independently proposed and further formalized by
Levin's [11] semi-measure, herein denoted by m(s).

Unlike Chaitin's Ω, it is not only whether a program halts or not that
matters for the concept of algorithmic probability; the output and halting
time of a halting Turing machine are also relevant in this case.

Levin's semi-measure m(s) is the probability of producing a string s with a
random program p (i.e. every bit of p is the result of an independent toss of
a fair coin) when running on a universal prefix-free Turing machine U.
Formally,

Definition 3. $m(s) = \sum_{p : U(p) = s} 2^{-|p|}$

Levin's probability measure induces a distribution over programs producing s,
assigning to the shortest program the highest probability and smaller
probabilities to longer programs.

There is a theorem connecting algorithmic probability to algorithmic
complexity. m(s) is at least as large as the maximum term in the summation
that defines it, and that term, $2^{-C(s)}$, comes from the shortest program,
which carries the greatest weight. Formally, the theorem states that the
following relation holds:

Theorem (coding theorem) 2. $-\log_2 m(s) = C(s) + O(1)$

Nevertheless, m(s) as a function of s is, like C(s) and Chaitin's Ω,
noncomputable due to the halting problem^2.
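
As a minimal sketch of how the coding theorem is used numerically, the snippet below turns a probability into a complexity estimate in bits by dropping the O(1) term. The function name and the sample probabilities are hypothetical and chosen only for illustration; they are not values reported in this paper.

```python
import math


def complexity_from_probability(prob: float) -> float:
    """Estimate C(s) in bits as -log2 m(s), ignoring the additive O(1) term."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must lie in (0, 1]")
    return -math.log2(prob)


# Hypothetical probabilities, used only to illustrate the conversion.
examples = {"a": 0.5, "b": 0.25, "c": 0.125}
estimates = {s: complexity_from_probability(p) for s, p in examples.items()}
```

Because the dropped constant is the same for every string, the ranking of strings by estimated complexity is the reverse of their ranking by probability.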

2.4. The Busy Beaver problem: Solving the halting problem for small Turing
machines

Notation: We denote by (n, 2) the class (or space) of all n-state 2-symbol
Turing machines (with the halting state not included among the n states).

Definition 4. [18] If σ_T is the number of 1s on the tape of a Turing machine
T upon halting, then Σ(n) = max{σ_T : T ∈ (n, 2), T halts}.

Definition 5. [18] If t_T is the number of steps that a machine T takes upon
halting, then S(n) = max{t_T : T ∈ (n, 2), T halts}.

Σ(n) and S(n) as defined in Definitions 4 and 5 (and known as the Busy Beaver
functions) are noncomputable by reduction to the halting problem [18]. Yet
values are known for (n, 2) with n ≤ 4. The solution for (n, 2) with n < 3 is
trivial, the process leading to the solution in (3, 2) is discussed by Lin
and Rado [15], and the process leading to the solution in (4, 2) has likewise
been reported.

A program showing the evolution of all known Busy Beaver machines, developed
by one of this paper's authors, is available online. The Turing machine model
followed in this paper is the same as the one described for the Busy Beaver
problem as introduced by Rado [18].

^2 An important property of m as a semi-measure is that it dominates any
other effective semi-measure μ, because there is a constant c_μ such that,
for all s, m(s) ≥ c_μ μ(s). For this reason m(s) is often called a universal
distribution [9].

3. The empirical distribution D 

It is important to describe the Turing machine formalism because exact 
values of algorithmic probability for short strings will be provided under this 
chosen standard model of Turing machines. 

Definition 6. Consider a Turing machine with the binary alphabet Σ = {0, 1}
and n states {1, 2, ..., n} and an additional Halt state denoted by 0 (just
as defined in Rado's original Busy Beaver paper [18]).

The machine runs on a 2-way unbounded tape. At each step:

1. the machine's current "state" (instruction); and

2. the tape symbol the machine's head is scanning

define each of the following:

1. a unique symbol to write (the machine can overwrite a 1 on a 0, a 0 on
a 1, a 1 on a 1, and a 0 on a 0);

2. a direction to move in: −1 (left), 1 (right) or 0 (none, when halting);
and

3. a state to transition into (may be the same as the one it was in).

The machine halts if and when it reaches the special halt state 0. There
are $(4n + 2)^{2n}$ Turing machines with n states and 2 symbols according to
the formalism described above.
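
The count can be checked with a short sketch. It assumes the following reading of Definition 6, which is our interpretation rather than a statement from the source: each of the 2n (state, symbol) entries of a rule table is either one of 4n moving transitions (2 symbols to write, 2 directions, n target states) or one of 2 halting transitions (write 0 or 1, no move). The function name is hypothetical.

```python
def machines_in_space(n: int) -> int:
    """Number of n-state 2-symbol machines: (4n + 2) choices for 2n entries."""
    return (4 * n + 2) ** (2 * n)


# Number of machines per space, and number of runs when every machine is
# started once on an all-0 tape and once on an all-1 tape.
counts = {n: machines_in_space(n) for n in range(1, 6)}
runs = {n: 2 * c for n, c in counts.items()}
```

Under this reading, `runs` gives 72, 20 000, 15 059 072, 22 039 921 152 and 53 119 845 582 848 for n = 1, ..., 5, which agrees with the totals quoted in the text.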

No transition starting from the halting state exists, and the blank symbol
is one of the 2 symbols (0 or 1) in the first run, while the other is used in
the second run (in order to avoid any asymmetries due to the choice of a
single blank symbol). In other words, we run each machine twice, once with 0
as the blank symbol (the symbol with which the tape starts out and is filled
with), and once more with 1 as the blank symbol^3. The output string is taken
from the contiguous cells on the tape that the head of the halting n-state
machine has gone through. A machine produces a string upon halting.

Definition 7. D(n) is the function that retrieves the number of machines
that halt (denoted by d(n)) in (n, 2) and then assigns to every string s
produced by (n, 2) the quotient: (number of times that a machine in (n, 2)
produces s) / (number of machines in (n, 2) that halt).

Examples of D(n) for n = 1, n = 2:

d(1) = 24, D(1) = 0 → 0.5; 1 → 0.5
d(2) = 6088, D(2) = 0 → 0.328; 1 → 0.328; 00 → 0.0834 ...

Tables 1, 2 and 3 in Section 5 show the results for D(1), D(2) and D(3), and
Table 4 the top ranking of D(4).
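
The construction of D(n) can be sketched in a few lines. This is an illustrative reading of Definitions 6 and 7, not the authors' Mathematica or C++ code: it assumes that every machine starts in state 1, that the halting transition counts as a step, and that the output is the content of the contiguous cells visited by the head. All function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

HALT = 0  # the halting state, as in Definition 6


def transitions(n):
    """All 4n + 2 possible rule entries (write, move, next_state)."""
    halting = [(w, 0, HALT) for w in (0, 1)]
    moving = [(w, d, q) for w in (0, 1) for d in (-1, 1)
              for q in range(1, n + 1)]
    return halting + moving


def run(table, blank, max_steps):
    """Run one machine from state 1 on a tape filled with `blank`.

    Returns the string on the visited cells if the machine halts within
    max_steps steps, and None otherwise.
    """
    tape, pos, state = {}, 0, 1
    lo = hi = 0  # leftmost and rightmost cells visited so far
    for _ in range(max_steps):
        write, move, nxt = table[(state, tape.get(pos, blank))]
        tape[pos] = write
        if nxt == HALT:
            return "".join(str(tape.get(i, blank)) for i in range(lo, hi + 1))
        pos += move
        lo, hi = min(lo, pos), max(hi, pos)
        state = nxt
    return None


def empirical_distribution(n, max_steps):
    """Return (d(n), D(n)), running every (n, 2) machine on both blanks."""
    keys = [(q, s) for q in range(1, n + 1) for s in (0, 1)]
    outputs = Counter()
    for entries in product(transitions(n), repeat=len(keys)):
        table = dict(zip(keys, entries))
        for blank in (0, 1):
            out = run(table, blank, max_steps)
            if out is not None:
                outputs[out] += 1
    halted = sum(outputs.values())
    return halted, {s: Fraction(c, halted) for s, c in outputs.items()}


# n = 1 with the Busy Beaver bound S(1) = 1 as the step limit.
d1, D1 = empirical_distribution(1, max_steps=1)
```

For n = 1 this gives d(1) = 24 with both strings at probability 1/2, as in the example above. Larger n would use the corresponding S(n) as `max_steps`; whether the sketch then reproduces the published counts depends on the assumed conventions matching the authors' exactly.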

Theorem 3. D(n) is noncomputable.

Proof (by reduction to the halting problem): From knowledge of the number of
n-state Turing machines that halt, one can determine for every Turing machine
whether it stops or not, by the following argument (by contradiction): Assume
D(n) is computable. Let T be any arbitrary Turing machine. To solve the
halting problem for T, calculate D(n), where n is the number of states in T.
By hypothesis, D(n) outputs d(n) and the list assigning frequencies to
strings. Run all possible n-state Turing machines in parallel, and wait until
d(n) many of the machines have halted. If T is one of the machines that has
halted, then T halts. Otherwise, T doesn't halt. We have just shown that if
D(n) were computable, then the halting problem would be solvable. Since the
halting problem is known to be unsolvable, D must be noncomputable.

^3 Due to the symmetry of the computation, there is no real need to run each
machine twice; one can complete the string frequencies by assuming that each
string produced its reversed and complemented versions with the same
frequency, and then group and divide by symmetric groups. A more detailed
explanation of how this is done is in [3].

Exact values of D(n) can, however, be calculated for small Turing machines
because of the known values of the Busy Beaver problem (in particular S(n))
for n < 5. For example, for n = 4, S(4) = 107, so we know that any machine
running for more than 107 steps will never halt, and so we stop it
thereafter.

For each Busy Beaver candidate with n > 4 states, a sample of Turing
machines running up to the candidate S(n) is also possible. As with Rado's
Busy Beaver functions Σ(n) and S(n), D(n) is also approachable from above.

For larger n, sampling methods asymptotically converging to D(n) can be used
to approximate D(n). In Section 5 we provide exact values of D(n) for n < 5
thanks to the known Busy Beaver values.

Another property shared between D(n) and the Busy Beaver problem is that
D(n), just as the values of the Busy Beaver, is well-defined in the sense
that the digits of D(n) are fully determined once calculated, but the
calculation of D(n) rapidly becomes impractical, even for a slightly larger
number of states. Our quest is thus similar in several respects to the Busy
Beaver problem or the calculation of the digits of Chaitin's Ω number. The
main underlying difficulty in analyzing thoroughly a given class of machines
is the undecidability of the halting problem, and hence the uncomputability
of the related functions.

4. Methodology 

The approach for evaluating the complexity C(s) of a string s presented
herein is limited by (1) the halting problem and (2) computing time
constraints. Restriction (1) was overcome using the values of the Busy Beaver
problem, which bound the halting times of all Turing machines starting with a
blank tape. Restriction (2) represented a challenge in terms of computing
time and programming skills. It is also the same restriction that has kept
others from attempting to solve the Busy Beaver problem for a greater number
of states. We were able to compute up to about 1.3775 x 10^9 machines per
day, or 15 943 per second, taking us about 9 days^4 to run all (4, 2) Turing
machines, each up to the number of steps bounded by the Busy Beaver values.
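
The throughput figures can be related by simple arithmetic. The sketch below takes the reported rate of 15 943 machines per second and the size of the (4, 2) space, (4n + 2)^(2n) with n = 4, and derives the daily rate and the number of days of pure computation this implies; the variable names are ours.

```python
MACHINES_PER_SECOND = 15_943          # rate reported in the text
SECONDS_PER_DAY = 24 * 60 * 60
MACHINES_IN_4_2 = (4 * 4 + 2) ** (2 * 4)  # (4n + 2)^(2n) for n = 4

machines_per_day = MACHINES_PER_SECOND * SECONDS_PER_DAY
days_needed = MACHINES_IN_4_2 / machines_per_day
```

The daily rate comes out at 1 377 475 200, i.e. about 1.3775 x 10^9, and the whole space takes roughly 8 days at that sustained rate, the same order as the "about 9 days" reported.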

Just as is done for solving small values of the Busy Beaver problem, we rely
on the experimental approach to analyze and describe a computable fraction of
the uncomputable. A similar quest for the calculation of the digits of a
Chaitin Ω number was undertaken by Calude et al., but unlike Chaitin's Ω, the
calculation of D(n) does not depend on the enumeration of Turing machines. It
is easy to see that every (n, 2) Turing machine contributing to D(n) is
included in D(n + 1), simply because every Turing machine in (n, 2) is also
in (n + 1, 2).

4.1. Numerical calculation of D

We consider the space (n, 2) of Turing machines with 0 < n < 5. The halting
"history" and output probability followed by their respective runtimes,
presented in Tables 1, 2 and 3, show the times at which the programs in the
domain of M halt, the frequency of the strings produced, and the time at
which they halted after writing down the output string on their tape.

We provide exact values for n = {2, 3, 4} in the Results (Section 5). We
derive D(n) for n < 5 by counting the number of n-strings produced by all
(n, 2) Turing machines upon halting. We define D to be an empirical universal
distribution in Levin's sense, and calculate the algorithmic complexity C of
a string s in terms of D using the coding theorem. We cannot escape the
additive constant introduced by the application of the coding theorem, but
this constant is common to all values and therefore should not impact the
relative order. One has to bear in mind, however, that the tables in Section
5 should be read as dependent on this last-step additive constant, because
using the coding theorem as an approximation method fixes a prefix-free
universal Turing machine via that constant. According to the choices we make,
this seems to be the most natural way to do so, as an alternative to other
indirect choosing procedures.



^4 Running on a MacBook Intel Core Duo at 1.83 GHz with 2 GB of RAM and a
solid state drive, using the TuringMachine[] function available in
Mathematica 8 for n < 4 and a C++ program for n = 4. Since for n = 4 there
were 2.56 x 10* machines involved, running on both 0 and 1 as blank, further
optimizations were required. The use of a Bignum library and an actual
enumeration of the machines, rather than producing the rules beforehand
(which would have meant overloading the memory even before the actual
calculation), was necessary.

We calculated the 72, 20 000, 15 059 072 and 22 039 921 152 two-way tape
Turing machines started with a tape filled with 0s and 1s for D(1), D(2),
D(3) and D(4)^5. The number of Turing machines to calculate grows
exponentially with the number of states. For D(5) there are
53 119 845 582 848 machines to calculate, which makes the task as difficult
as finding the Busy Beaver values Σ(5) and S(5), which are currently unknown
but for which the best candidate may be S(5) = 47 176 870, making the
exploration of (5, 2) a great challenge.

Although several ideas exploiting symmetries to reduce the total number of
Turing machines have been proposed and used for finding Busy Beaver
candidates in large spaces such as n > 5, we could not apply all of them if
the structure of the data was to be preserved. This is because, unlike the
Busy Beaver challenge, in which only the maximum values are important, the
construction of a probability distribution requires every output to be
equally considered. Some reduction techniques were, however, utilized, such
as running only one-direction rules with a tape filled only with 0s and then
completing the strings by reversion and complementation, to avoid running
every machine a second time with a tape filled with 1s. For an explanation of
how we counted the number of symmetries to recover the outputs of the
machines that were skipped, see [3].
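
The completion by complementation can be sketched as follows. The sketch assumes that swapping the roles of 0 and 1 in a rule table maps the set of all (n, 2) machines onto itself, so that the count of a string t on an all-1 tape equals the count of the complement of t on an all-0 tape. The function names are hypothetical and the exact bookkeeping used by the authors (including the reversal step) is not reproduced here.

```python
def complement(s: str) -> str:
    """Swap 0s and 1s in a bit string."""
    return s.translate(str.maketrans("01", "10"))


def symmetry_group(s: str) -> set:
    """Strings related to s by reversal and/or complementation."""
    c = complement(s)
    return {s, s[::-1], c, c[::-1]}


def complete_by_complement(blank0_counts: dict) -> dict:
    """Add the all-1-tape counts inferred from the all-0-tape counts."""
    strings = set(blank0_counts) | {complement(s) for s in blank0_counts}
    return {s: blank0_counts.get(s, 0) + blank0_counts.get(complement(s), 0)
            for s in strings}


# Counts for (1, 2) on an all-0 tape: 6 machines output "0", 6 output "1".
completed = complete_by_complement({"0": 6, "1": 6})
```

With the (1, 2) counts above, the completed totals are 12 for each string, i.e. 24 halting runs in all, matching d(1) = 24.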

5. Results 

5.1. Algorithmic probability tables 

D(1) is trivial. (1, 2) Turing machines produce only two strings, with the
same number of machines producing each. The Busy Beaver values for n = 1 are
Σ(1) = 1 and S(1) = 1. That is, all machines that halt do so after 1 step,
and print at most one symbol.

Table 1: Distribution D(1) from the d(1) = 24 machines in (1, 2) that halt,
out of a total of 72 Turing machine runs (36 machines, each run on both blank
symbols).

0 : 0.5
1 : 0.5

^5 The space occupied by the outputs building D(4) was 77.06 GB.

The Busy Beaver values for n = 2 are Σ(2) = 4 and S(2) = 6. D(2) is quite
simple but starts to display some basic structure, such as a clear
correlation between string length and occurrence, following what may be an
exponential decrease in the number of string occurrences:

P(|s| = 1) = 0.657
P(|s| = 2) = 0.333
P(|s| = 3) = 0.0065
P(|s| = 4) = 0.0026



Table 2: Distribution D(2) from the 6 088 machines in (2, 2) that halt, out
of 20 000 Turing machines. Each string is followed by its probability (from
the number of times produced), sorted from highest to lowest.

0   : 0.328        010  : 0.00065
1   : 0.328        101  : 0.00065
00  : 0.0834       111  : 0.00065
01  : 0.0834       0000 : 0.00032
10  : 0.0834       0010 : 0.00032
11  : 0.0834       0100 : 0.00032
001 : 0.00098      0110 : 0.00032
011 : 0.00098      1001 : 0.00032
100 : 0.00098      1011 : 0.00032
110 : 0.00098      1101 : 0.00032
000 : 0.00065      1111 : 0.00032
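
The probabilities by string length quoted for D(2) can be recovered, up to rounding, by summing the rounded entries of Table 2. The snippet below is a small check of that arithmetic; the dictionary simply transcribes the table.

```python
# Rounded probabilities transcribed from Table 2.
table2 = {
    "0": 0.328, "1": 0.328,
    "00": 0.0834, "01": 0.0834, "10": 0.0834, "11": 0.0834,
    "001": 0.00098, "011": 0.00098, "100": 0.00098, "110": 0.00098,
    "000": 0.00065, "010": 0.00065, "101": 0.00065, "111": 0.00065,
    "0000": 0.00032, "0010": 0.00032, "0100": 0.00032, "0110": 0.00032,
    "1001": 0.00032, "1011": 0.00032, "1101": 0.00032, "1111": 0.00032,
}

# Sum the probabilities of all strings of each length.
by_length = {}
for s, p in table2.items():
    by_length[len(s)] = by_length.get(len(s), 0.0) + p
```

The sums are about 0.656, 0.3336, 0.0065 and 0.0026 for lengths 1 to 4; the small differences from the quoted 0.657 and 0.333 are consistent with the table entries having been rounded.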



Among the various facts one can draw from D(2), there are:

• There are d(2) = 6088 machines that halt out of the 20 000 Turing
machines in (2, 2), as the result of running every machine over a tape
filled with 0 and then again over a tape filled with 1.

• The relative string order in D(1) is preserved in D(2).

• In (1, 2), a fraction of 1/3 of the total machines halt while the
remaining 2/3 do not. That is, 24 among 72 (running each machine twice, with
tape filled with 1 and 0 as explained before).

• The longest string produced by D(2) is of length 4.

• D(2) does not produce all $\sum_{n=1}^{4} 2^n = 30$ strings shorter than
5, only 22. The strings 0001, 0101 and 0011 were never produced, hence
neither were their complements and reversions: 0111, 1000, 1110, 1010 and
1100.

Given the number of machines to run, D(3) constitutes the first non-trivial
probability distribution to calculate. The Busy Beaver values for n = 3 are
Σ(3) = 6 and S(3) = 21.

Among the various facts for D(3):

• There are d(3) = 4 294 368 machines that halt among the 15 059 072 in
(3, 2). That is a fraction of 0.2851.

• The longest string produced in (3, 2) is of length 7.

• D(3) does not contain all $\sum_{n=1}^{7} 2^n = 254$ strings of length up
to 7, but only 128, about half of all the possible strings up to that length.

• D(3) preserves the string order of D(2).

D(3) ratifies the tendency of classifying strings by length with
exponentially decreasing values. The distribution comes sorted by length
blocks, from which one cannot easily say whether the strings at the bottom
are more random-looking than those in the middle, but one can definitely say
that the ones at the top, both for the entire distribution and by length
block, are intuitively the simplest. Both 0^n and its complement 1^n for
n < 8 are always at the top of each block, with 0 and 1 at the top of them
all. There is a single exception in which strings were not sorted by length:
the string group 0101010 and 1010101, found four places away from its length
block. We take this as a second indication of a complexity classification
becoming more visible, since these 2 strings correspond to what one would
intuitively consider less random-looking, because they are easily described
as the repetition of two bits.

Table 3: Probability distribution D(3) produced by all the 15 059 072
Turing machines in (3, 2).

0     : 0.250        11110  : 0.0000470        100101  : 1.43 x 10^-6
1     : 0.250        00100  : 0.0000456        101001  : 1.43 x 10^-6
00    : 0.101        11011  : 0.0000456        000011  : 9.313 x 10^-7
01    : 0.101        01010  : 0.0000419        000110  : 9.313 x 10^-7
10    : 0.101        10101  : 0.0000419        001100  : 9.313 x 10^-7
11    : 0.101        01001  : 0.0000391        001101  : 9.313 x 10^-7
000   : 0.0112       01101  : 0.0000391        001111  : 9.313 x 10^-7
111   : 0.0112       10010  : 0.0000391        010001  : 9.313 x 10^-7
001   : 0.0108       10110  : 0.0000391        010010  : 9.313 x 10^-7
011   : 0.0108       01110  : 0.0000289        010011  : 9.313 x 10^-7
100   : 0.0108       10001  : 0.0000289        011000  : 9.313 x 10^-7
110   : 0.0108       00101  : 0.0000233        011101  : 9.313 x 10^-7
010   : 0.00997      01011  : 0.0000233        011110  : 9.313 x 10^-7
101   : 0.00997      10100  : 0.0000233        100001  : 9.313 x 10^-7
0000  : 0.000968     11010  : 0.0000233        100010  : 9.313 x 10^-7
1111  : 0.000968     00011  : 0.0000219        100111  : 9.313 x 10^-7
0010  : 0.000699     00111  : 0.0000219        101100  : 9.313 x 10^-7
0100  : 0.000699     11000  : 0.0000219        101101  : 9.313 x 10^-7
1011  : 0.000699     11100  : 0.0000219        101110  : 9.313 x 10^-7
1101  : 0.000699     000000 : 3.733 x 10^-6    110000  : 9.313 x 10^-7
0101  : 0.000651     111111 : 3.733 x 10^-6    110010  : 9.313 x 10^-7
1010  : 0.000651     000001 : 2.793 x 10^-6    110011  : 9.313 x 10^-7
0001  : 0.000527     011111 : 2.793 x 10^-6    111001  : 9.313 x 10^-7
0111  : 0.000527     100000 : 2.793 x 10^-6    111100  : 9.313 x 10^-7
1000  : 0.000527     111110 : 2.793 x 10^-6    0101010 : 9.313 x 10^-7
1110  : 0.000527     000100 : 2.333 x 10^-6    1010101 : 9.313 x 10^-7
0110  : 0.000510     001000 : 2.333 x 10^-6    001110  : 4.663 x 10^-7
1001  : 0.000510     110111 : 2.333 x 10^-6    011100  : 4.663 x 10^-7
0011  : 0.000321     111011 : 2.333 x 10^-6    100011  : 4.663 x 10^-7
1100  : 0.000321     000010 : 1.863 x 10^-6    110001  : 4.663 x 10^-7
00000 : 0.0000969    001001 : 1.863 x 10^-6    0000010 : 4.663 x 10^-7
11111 : 0.0000969    001010 : 1.863 x 10^-6    0000110 : 4.663 x 10^-7
00110 : 0.0000512    010000 : 1.863 x 10^-6    0100000 : 4.663 x 10^-7
01100 : 0.0000512    010100 : 1.863 x 10^-6    0101110 : 4.663 x 10^-7
10011 : 0.0000512    011011 : 1.863 x 10^-6    0110000 : 4.663 x 10^-7
11001 : 0.0000512    100100 : 1.863 x 10^-6    0111010 : 4.663 x 10^-7
00010 : 0.0000489    101011 : 1.863 x 10^-6    1000101 : 4.663 x 10^-7
01000 : 0.0000489    101111 : 1.863 x 10^-6    1001111 : 4.663 x 10^-7
10111 : 0.0000489    110101 : 1.863 x 10^-6    1010001 : 4.663 x 10^-7
11101 : 0.0000489    110110 : 1.863 x 10^-6    1011111 : 4.663 x 10^-7
00001 : 0.0000470    111101 : 1.863 x 10^-6    1111001 : 4.663 x 10^-7
01111 : 0.0000470    010110 : 1.43 x 10^-6     1111101 : 4.663 x 10^-7
10000 : 0.0000470    011010 : 1.43 x 10^-6

D(4), with 22 039 921 152 machines to run, was a true challenge, both in
terms of programming specification and computational resources. The Busy
Beaver values for n = 4 are Σ(4) = 13 and S(4) = 107. Evidently every machine
in (n, 2) for n ≤ 4 is in (4, 2), because a rule in (n, 2) with n ≤ 4 is a
rule in (4, 2). The results are presented in Table 4, and it is important to
notice that the table presents only the top of a much larger classification,
available online at http://www.algorithmicnature.org under the paper title as
additional material. Hence, among the 129 there are supposed to be the
strings with greatest structure. The reader can verify that the closer to the
bottom a string is, the more random-looking it is.



Among the various facts from these results:

• There are d(4) = 5 970 768 960 machines that halt in (4, 2). That is a
fraction of 0.27.

• A total number of 1824 strings were produced in (4, 2).

• The longest string produced is of length 16 (only 8 among all the 2^16
possible were generated).

• The Busy Beaver machines (writing more 1s than any other and halting)
found in (4, 2) had very low probability among all the halting machines:
pr(11111111111101) = 2.01 x 10^-9. Because of the reverted string
(10111111111111), the total probability of finding a Busy Beaver in (4, 2)
is therefore only 4.02 x 10^-9 (or twice that number if the complemented
string with the maximum number of 0s is taken).

• The longest strings in (4, 2) were in the string groups represented by the
following strings: 1101010101010101, 1101010100010101, 1010101010101011 and
1010100010101011, each with about 5.4447 x 10^-10 probability, i.e. an even
smaller probability than for the Busy Beavers, and therefore the most random
in the classification.

• (4, 2) produces all strings up to length 8; then the number of strings
longer than 8 rapidly decreases. The following are the numbers of strings by
length |{s : |s| = l}| generated and represented in D(4), from a total of
1824 different strings. For l = 1, ..., 15 the values of |{s : |s| = l}| are
2, 4, 8, 16, 32, 64, 128, 256, 486, 410, 252, 112, 46, 8, and 0, which
indicates that all 2^l strings were generated for l ≤ 8.

• While the probability of producing a string with an odd number of 1s is
the same as the probability of producing a string with an even number of 1s
(and therefore the same for 0s), the probability of producing a string of
odd length is 0.559, against 0.441 for even length.

• As in D(3), where one string group (0101010 and its reversion) was not
sorted with its length block, in D(4) 399 strings climbed toward the top and
were not sorted among their length groups.

• In D(4) string length was no longer a determinant for string positions.
For example, between positions 780 and 790, string lengths are: 11, 10, 10,
11, 9, 10, 9, 9, 9, 10 and 9 bits.
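
The counts of strings by length listed above can be cross-checked against the reported total. The short sketch below sums the fifteen values and finds the lengths for which every possible string was produced; the variable names are ours.

```python
# Number of distinct strings of length l = 1, ..., 15 reported for D(4).
counts_by_length = [2, 4, 8, 16, 32, 64, 128, 256, 486, 410, 252, 112, 46, 8, 0]

total_strings = sum(counts_by_length)

# Lengths l for which all 2^l possible strings appear.
complete_lengths = [l for l, c in enumerate(counts_by_length, start=1)
                    if c == 2 ** l]
```

The values add up to 1824, the total stated in the list, and the complete lengths are exactly 1 through 8.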

Table 4: The top 129 strings from D(4) with highest probability (therefore
with lowest random complexity) from 1824 different produced strings.



: 0.205 


01101 : 


0.000145 


110111 : 


0.0000138 




1 : 0.205 


10010 : 


0.000145 


111011 : 


0.0000138 




00 : 0.102 


10110 : 


0.000145 


001001 : 


0.0000117 




01 : 0.102 


01010 : 


0.000137 


011011 : 


0.0000117 




10 : 0.102 


10101 : 


0.000137 


100100 : 


0.0000117 




11 : 0.102 


00110 : 


0.000127 


110110 : 


0.0000117 




000 : 0.0188 


01100 : 


0.000127 


010001 : 


0.0000109 




111 : 0.0188 


10011 : 


0.000127 


011101 : 


0.0000109 




001 : 0.0180 


11001 : 


0.000127 


100010 : 


0.0000109 




Oil : 0.0180 


00101 : 


0.000124 


101110 : 


0.0000109 




100 : 0.0180 


01011 : 


0.000124 


000011 : 


0.0000108 




110 : 0.0180 


10100 : 


0.000124 


001111 : 


0.0000108 




010 : 0.0171 


11010 : 


0.000124 


110000 : 


0.0000108 




101 : 0.0171 


00011 : 


0.000108 


111100 : 


0.0000108 




0000 : 


0.00250 


00111 : 


0.000108 


000110 : 


0.0000107 




1111 : 


0.00250 


11000 : 


0.000108 


011000 : 


0.0000107 




0001 : 


0.00193 


11100 : 


0.000108 


100111 : 


0.0000107 




0111 : 


0.00193 


OHIO : 


0.0000928 


111001 : 


0.0000107 




1000 : 


0.00193 


10001 : 


0.0000928 


001101 : 


0.0000101 




1110 : 


0.00193 


000000 


: 0.0000351 


010011 : 


0.0000101 




0101 : 


0.00191 


111111 

iiiiii 


: 0.0000351 


iUllUU : 


n nnnm m 
U.UUUUIUI 




1010 : 


0.00191 


UUUUUi 


. n nnnm dK 
: U.UUUUiyo 


1 1 nm n . 
iiUUiU : 


n nnnm m 
U.UUUUIUI 




0010 : 


0.00190 


m 1 1 1 1 
Uiiiii 


. U.UUUUiyt) 


nm 1 nn . 
UUiiUU : 


y.y4ox iu 


6 


0100 : 


0.00190 


iUUUUU 


: U.UUUUiyo 


1 1 nm 1 . 
iiUUii : 


y.y4ox iu 


6 


1011 : 


0.00190 


1 1 1 1 1 n 
iiiiiU 


: U.UUUUiyo 


ni 1 1 1 n . 
UiiiiU : 


n AQQ \/ 1 n^ 

y.Doox iu 


6 


1101 : 


0.00190 


UUUUiU 


: U.UUUUio4 


1 nnnm . 
iUUUUi : 


y.ooox iu 


6 


0110 : 


0.00163 


UlUUUU 


: U.UUUUio4 


m 1 nm . 
UllUUl : 


n Q 1 n~6 

y.ox lu 




1001 : 


0.00163 


101111 


• 00001 84 


1 001 1 • 


Q X 1 0^6 




0011 : 


0.00161 


111101 


: 0.0000184 


000101 : 


8.753x10- 


6 


1100 : 


0.00161 


010010 


: 0.0000160 


010111 : 


8.753x10" 


6 


00000 


0.000282 


101101 


: 0.0000160 


101000 : 


8.753x10- 


6 


11111 


0.000282 


010101 


: 0.0000150 


111010 : 


8.753x10- 


6 


00001 


0.000171 


101010 


: 0.0000150 


001110 : 


7.863x10" 


6 


01111 


0.000171 


010110 


: 0.0000142 


011100 : 


7.863x10" 


6 


10000 


0.000171 


011010 


: 0.0000142 


100011 : 


7.863x10" 


6 


11110 


0.000171 


100101 


: 0.0000142 


110001 : 


7.863x10" 


6 


00010 


0.000166 


101001 


: 0.0000142 


001011 : 


6.523x10" 


6 


01000 


0.000166 


001010 


: 0.0000141 


110100 : 


6.523x10" 


6 


10111 


0.000166 


010100 


: 0.0000141 


000111 : 


6.243x10" 


6 


11101 


0.000166 


101011 


: 0.0000141 


111000 : 


6.243x10" 


6 


00100 


0.000151 


110101 


: 0.0000141 


0000000 


: 3.723x10 


-6 


11011 


0.000151 


000100 


: 0.0000138 


1111111 


: 3.723x10 


-6 


01001 


0.000145 


001000 


: 0.0000138 


0101010 


: 2.393x10 


-6 
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• D(4:) preserves the string order of D{3) except in 17 places out of 128 
strings in D{3) ordered from highest to lowest string frequency. The 
maximum rank distance among the farthest two differing elements in 
D{3) and -D(4) was 20, with an average of 11.23 among the 17 misplaced 
cases and a standard deviation of about 5 places. The Spearman's rank 
correlation coefficient between the two rankings had a critical value of 
0.98, meaning that the order of the 128 elements in D{3) compared to 
their order in D{4) were in an interval confidence of high significance 
with almost null probability to have produced by chance. 
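As a quick consistency check on the per-length string counts quoted in the list above, the following Python sketch (the variable names are ours, not the paper's) confirms that the counts add up to the stated total of 1824 strings and identifies the lengths for which every possible string was produced.

```python
# Number of distinct strings of each length n = 1..15 in D(4), as quoted above.
counts_by_length = [2, 4, 8, 16, 32, 64, 128, 256, 486, 410, 252, 112, 46, 8, 0]

total = sum(counts_by_length)

# Lengths n for which all 2^n possible strings were produced.
complete_lengths = [
    n for n, count in enumerate(counts_by_length, start=1) if count == 2 ** n
]

print(total)             # 1824
print(complete_lengths)  # [1, 2, 3, 4, 5, 6, 7, 8]
```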



Table 5: Probabilities of finding n 1s (or 0s) in (4, 2).

number n of 1s    pr(n)
1                 0.472
2                 0.167
3                 0.0279
4                 0.00352
5                 0.000407
6                 0.0000508
7                 6.5×10^-6
8                 1.31×10^-6
9                 2.25×10^-7
10                3.62×10^-8
11                1.61×10^-8
12                1.00×10^-8
13                4.02×10^-9


These are the top 10 string groups (i.e. with their reverted and
complemented counterparts) appearing sooner than expected and getting away
from their length blocks (that is, their lengths were greater than that of
the next string in the classification order): 11111111, 11110111,
000000000, 111111111, 000010000, 111101111, 111111110, 010101010,
101010101, 000101010. This means that these string groups had greater
algorithmic probability, and therefore lower algorithmic complexity, than
shorter strings.

Table 6: String groups formed by reversion and complementation, followed by the total
number of machines producing them.

string group                        # occurrences
0, 1                                1224440064
01, 10                              611436144
00, 11                              611436144
001, 011, 100, 110                  215534184
000, 111                            112069020
010, 101                            102247932
0001, 0111, 1000, 1110              23008080
0010, 0100, 1011, 1101              22675896
0000, 1111                          14917104
0101, 1010                          11425392
0110, 1001                          9712752
0011, 1100                          9628728
00001, 01111, 10000, 11110          2042268
00010, 01000, 10111, 11101          1984536
01001, 01101, 10010, 10110          1726704
00000, 11111                        1683888
00110, 01100, 10011, 11001          1512888
00101, 01011, 10100, 11010          1478244
00011, 00111, 11000, 11100          1288908
00100, 11011                        900768
01010, 10101                        819924
01110, 10001                        554304
000001, 011111, 100000, 111110      233064
000010, 010000, 101111, 111101      219552
000000, 111111                      209436
010110, 011010, 100101, 101001      169896
001010, 010100, 101011, 110101      167964
000100, 001000, 110111, 111011      164520
001001, 011011, 100100, 110110      140280
010001, 011101, 100010, 101110      129972
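The grouping used in Table 6 can be sketched in a few lines of Python. This is only an illustration of the reversion and complementation operations; the function names `complement`, `string_group` and `groups` are ours and do not come from the paper.

```python
from itertools import product


def complement(s):
    """Swap 0s and 1s in a binary string."""
    return s.translate(str.maketrans("01", "10"))


def string_group(s):
    """Set of strings obtained from s by reversion and complementation."""
    rev = s[::-1]
    return {s, rev, complement(s), complement(rev)}


def groups(strings):
    """Partition an iterable of binary strings into distinct string groups."""
    return {frozenset(string_group(s)) for s in strings}


def all_strings(length):
    """All binary strings of the given length."""
    return ["".join(bits) for bits in product("01", repeat=length)]


print(sorted(string_group("001")))  # ['001', '011', '100', '110']
print(len(groups(all_strings(3))))  # 3 groups, as in Table 6
print(len(groups(all_strings(4))))  # 6 groups, as in Table 6
```

Groups have two members when a string coincides with its own reversion or with the reversion of its complement (for example 010 or 0110), and four members otherwise.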
Table 7 displays some statistical information about the distribution. The
distribution is skewed to the right: its mass is concentrated on the left,
with a long right tail, as shown in Fig. 2.

Table 7: Statistical values of the empirical distribution function D(4) for strings of length
l = 8.

statistic    value
mean         0.00391
median       0.00280
variance     0.0000136
kurtosis     23
skewness     3.6
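For readers who wish to recompute statistics of the kind listed in Table 7, here is a minimal Python sketch. It assumes population (not sample) moments and the non-excess (Pearson) definition of kurtosis; the paper does not state which conventions were used, so both are assumptions, and the sample below is a toy right-skewed data set rather than the actual D(4) values.

```python
from statistics import fmean, median


def summary(sample):
    """Mean, median, variance, skewness and kurtosis from population moments."""
    mu = fmean(sample)
    m2 = fmean([(x - mu) ** 2 for x in sample])  # variance
    m3 = fmean([(x - mu) ** 3 for x in sample])  # third central moment
    m4 = fmean([(x - mu) ** 4 for x in sample])  # fourth central moment
    return {
        "mean": mu,
        "median": median(sample),
        "variance": m2,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,  # non-excess kurtosis (assumption)
    }


# Toy sample: most of the mass on the left, one large value on the right.
toy = [1, 1, 1, 1, 6]
print(summary(toy))
```

On the toy sample the mean (2.0) exceeds the median (1) and the skewness is positive (1.5), the same qualitative pattern as in Table 7. Note also that the mean reported in Table 7, 0.00391, agrees with 1/256, which is what one obtains if the probabilities of the 256 strings of length 8 are normalized to sum to 1.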



Figure 1: (4, 2) frequency distribution by string length (horizontal axis: string length;
vertical axis: probability).
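The probabilities by string length plotted in Figure 1 (and listed in Table 8) can be used to check the odd/even length split of 0.559 against 0.441 reported earlier. The Python sketch below uses the Table 8 values for lengths 1 to 9 only; the longer lengths contribute too little to change the third decimal place. The variable names are ours.

```python
# Probability of producing a string of each length n, from Table 8 (n = 1..9).
length_probability = {
    1: 0.410, 2: 0.410, 3: 0.144, 4: 0.0306, 5: 0.00469,
    6: 0.000818, 7: 0.000110, 8: 0.0000226, 9: 4.69e-6,
}

odd = sum(p for n, p in length_probability.items() if n % 2 == 1)
even = sum(p for n, p in length_probability.items() if n % 2 == 0)

print(round(odd, 3), round(even, 3))  # 0.559 0.441
```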

Table 8: The probability of producing a string of length l decreases exponentially as l
increases linearly. The slowdown in the rate of decrease for string lengths l > 8 is due to
the few longer strings produced in (4, 2).

length n    pr(n)
1           0.410
2           0.410
3           0.144
4           0.0306
5           0.00469
6           0.000818
7           0.000110
8           0.0000226
9           4.69×10^-6
10          1.42×10^-6
11          4.9×10^-7
12          1.69×10^-7

Figure 2: Probability density function of bit strings of length l = 8 from (4, 2). The
histogram (left) shows the probabilities of falling within a particular region. The cumulative
version (right) shows how well the distribution fits a Pareto distribution (dashed) with
location parameter k = 10. The reader may see but a single curve; that is because the
lines overlap. D(4) (and the sub-distributions it contains) is therefore log-normal.

5.2. Derivation and calculation of the string's algorithmic complexity

Algorithmic complexity values are calculated from the output probability
distribution D(4) through the application of the coding theorem, and are
partially presented in Table 9. The full results are available online at
http://www.algorithmicnature.org under the paper title as additional
material.

Table 9: Top 180 strings sorted from lowest to highest algorithmic complexity.

0:2.29         10110:12.76     100100:16.38     0100000:19.10
1:2.29         01010:12.83     110110:16.38     1011111:19.10
00:3.29        10101:12.83     010001:16.49     1111101:19.10
01:3.29        00110:12.95     011101:16.49     0000100:19.38
10:3.29        01100:12.95     100010:16.49     0010000:19.38
11:3.29        10011:12.95     101110:16.49     1101111:19.38
000:5.74       11001:12.95     000011:16.49     1111011:19.38
111:5.74       00101:12.98     001111:16.49     0001000:19.45
001:5.79       01011:12.98     110000:16.49     1110111:19.45
011:5.79       10100:12.98     111100:16.49     0000110:19.64
100:5.79       11010:12.98     000110:16.52     0110000:19.64
110:5.79       00011:13.18     011000:16.52     1001111:19.64
010:5.87       00111:13.18     100111:16.52     1111001:19.64
101:5.87       11000:13.18     111001:16.52     0101110:19.68
0000:8.64      11100:13.18     001101:16.59     0111010:19.68
1111:8.64      01110:13.39     010011:16.59     1000101:19.68
0001:9.02      10001:13.39     101100:16.59     1010001:19.68
0111:9.02      000000:14.80    110010:16.59     0010001:20.04
1000:9.02      111111:14.80    001100:16.62     0111011:20.04
1110:9.02      000001:15.64    110011:16.62     1000100:20.04
0101:9.03      011111:15.64    011110:16.66     1101110:20.04
1010:9.03      100000:15.64    100001:16.66     0001001:20.09
0010:9.04      111110:15.64    011001:16.76     0110111:20.09
0100:9.04      000010:15.73    100110:16.76     1001000:20.09
1011:9.04      010000:15.73    000101:16.80     1110110:20.09
1101:9.04      101111:15.73    010111:16.80     0010010:20.11
0110:9.26      111101:15.73    101000:16.80     0100100:20.11
1001:9.26      010010:15.93    111010:16.80     1011011:20.11
0011:9.28      101101:15.93    001110:16.96     1101101:20.11
1100:9.28      010101:16.02    011100:16.96     0010101:20.15
00000:11.79    101010:16.02    100011:16.96     0101011:20.15
11111:11.79    010110:16.10    110001:16.96     1010100:20.15
00001:12.51    011010:16.10    001011:17.23     1101010:20.15
01111:12.51    100101:16.10    110100:17.23     0100101:20.16
10000:12.51    101001:16.10    000111:17.29     0101101:20.16
11110:12.51    001010:16.12    111000:17.29     1010010:20.16
00010:12.55    010100:16.12    0000000:18.03    1011010:20.16
01000:12.55    101011:16.12    1111111:18.03    0001010:20.22
10111:12.55    110101:16.12    0101010:18.68    0101000:20.22
11101:12.55    000100:16.15    1010101:18.68    1010111:20.22
00100:12.69    001000:16.15    0000001:18.92    1110101:20.22
11011:12.69    110111:16.15    0111111:18.92    0100001:20.26
01001:12.76    111011:16.15    1000000:18.92    0111101:20.26
01101:12.76    001001:16.38    1111110:18.92    1000010:20.26
10010:12.76    011011:16.38    0000010:19.10    1011110:20.26

The largest algorithmic complexity value after the application of the
coding theorem was max{C(s) : s ∈ D(4)} = 29 bits. When the values are
interpreted as program sizes, it is worth mentioning that after application
of the coding theorem the values obtained from the string frequencies are
often non-integer real numbers. One can either take the ceiling integer
value or treat the result as a different (finer) measure, closely related
to algorithmic complexity but not necessarily exactly the same (the
Kolmogorov-Chaitin complexity is a norm, the Solomonoff-Levin complexity
(algorithmic probability) is a frequency, and the coding theorem says that
they converge in the limit).
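The step from output frequencies to complexity values can be illustrated with a short Python sketch of the coding-theorem estimate C(s) = -log2 m(s). It uses a few occurrence counts from Table 6 and the number of halting machines in (4, 2) reported in the halting summary. Two things are assumptions of this sketch rather than statements of the paper: that the machines producing a string group are shared equally among the members of the group, and the names `HALTING_MACHINES`, `table6` and `complexity`.

```python
from math import ceil, log2

# Number of halting machines in (4, 2), as given in the halting summary.
HALTING_MACHINES = 2_985_384_480

# (group members, number of machines producing the group): rows of Table 6.
table6 = [
    (("0", "1"), 1_224_440_064),
    (("00", "11"), 611_436_144),
    (("000", "111"), 112_069_020),
    (("010", "101"), 102_247_932),
]


def complexity(probability):
    """Coding-theorem estimate of program-size complexity, in bits."""
    return -log2(probability)


for members, occurrences in table6:
    # Assumption: each member of a group receives an equal share of machines.
    m = occurrences / len(members) / HALTING_MACHINES
    c = complexity(m)
    print(members[0], f"m = {m:.3g}", f"C = {c:.2f}", f"ceiling = {ceil(c)}")
```

Under these assumptions the sketch reproduces the probabilities of Table 4 (0.205, 0.102, 0.0188, 0.0171) and, to within rounding, the complexity values of Table 9 (2.29, 3.29, 5.74, 5.87) for the strings 0, 00, 000 and 010. The ceiling column shows the integer alternative mentioned above.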



Figure 3: (4, 2) output log-frequency plot, ordered from most to least frequent string
(horizontal axis: string rank; vertical axis: log frequency).



5.2.1. Same length string complexity 

The complexity classification in Table 10 allows a comparison of the
structure of the strings with their calculated complexity among all the
strings of the same length extracted from D(4).

Table 10: Algorithmic complexity classification, from less to more random, for 7-bit
strings extracted from D(4) after application of the coding theorem.

0000000:18.03    1001000:20.09    0101001:20.42    0000111:20.99
1111111:18.03    1110110:20.09    0110101:20.42    0001111:20.99
0101010:18.68    0010010:20.11    1001010:20.42    1110000:20.99
1010101:18.68    0100100:20.11    1010110:20.42    1111000:20.99
0000001:18.92    1011011:20.11    0001100:20.48    0011110:21.00
0111111:18.92    1101101:20.11    0011000:20.48    0111100:21.00
1000000:18.92    0010101:20.15    1100111:20.48    1000011:21.00
1111110:18.92    0101011:20.15    1110011:20.48    1100001:21.00
0000010:19.10    1010100:20.15    0110110:20.55    0111110:21.03
0100000:19.10    1101010:20.15    1001001:20.55    1000001:21.03
1011111:19.10    0100101:20.16    0011010:20.63    0011001:21.06
1111101:19.10    0101101:20.16    0101100:20.63    0110011:21.06
0000100:19.38    1010010:20.16    1010011:20.63    1001100:21.06
0010000:19.38    1011010:20.16    1100101:20.63    1100110:21.06
1101111:19.38    0001010:20.22    0100010:20.68    0001110:21.08
1111011:19.38    0101000:20.22    1011101:20.68    0111000:21.08
0001000:19.45    1010111:20.22    0100110:20.77    1000111:21.08
1110111:19.45    1110101:20.22    0110010:20.77    1110001:21.08
0000110:19.64    0100001:20.26    1001101:20.77    0010011:21.10
0110000:19.64    0111101:20.26    1011001:20.77    0011011:21.10
1001111:19.64    1000010:20.26    0010110:20.81    1100100:21.10
1111001:19.64    1011110:20.26    0110100:20.81    1101100:21.10
0101110:19.68    0000101:20.29    1001011:20.81    0110001:21.13
0111010:19.68    0101111:20.29    1101001:20.81    0111001:21.13
1000101:19.68    1010000:20.29    0001101:20.87    1000110:21.13
1010001:19.68    1111010:20.29    0100111:20.87    1001110:21.13
0010001:20.04    0000011:20.38    1011000:20.87    0011100:21.19
0111011:20.04    0011111:20.38    1110010:20.87    1100011:21.19
1000100:20.04    1100000:20.38    0011101:20.93    0001011:21.57
1101110:20.04    1111100:20.38    0100011:20.93    0010111:21.57
0001001:20.09    0010100:20.39    1011100:20.93    1101000:21.57
0110111:20.09    1101011:20.39    1100010:20.93    1110100:21.57

5.2.2. Halting summary

Figure 4: Graphs showing the halting probabilities among (n, 2), n < 5. The list plot on
the left shows the decreasing fraction of halting Turing machines, while the paired bar
chart on the right allows a side-by-side visual comparison between halting and
non-halting machines.

In summary, among the 36, 10 000, 7 529 536 and 11 019 960 576 Turing
machines in (n, 2), n < 5 (running over an initially blank tape), there
were 12, 3 044, 2 147 184 and 2 985 384 480 that halted, that is, slightly
decreasing fractions of 0.333..., 0.3044, 0.2851 and 0.2709 respectively.
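The figures in this summary can be checked with a few lines of Python. The space sizes 36, 10 000, 7 529 536 and 11 019 960 576 agree with the closed form (4n + 2)^(2n) for n = 1, ..., 4, which the sketch below uses; the function name `machines_in_space` is ours.

```python
def machines_in_space(n):
    """Number of n-state, 2-symbol Turing machines, (4n + 2)^(2n)."""
    return (4 * n + 2) ** (2 * n)


# Number of halting machines in each space (n, 2), from the summary above.
halting = {1: 12, 2: 3_044, 3: 2_147_184, 4: 2_985_384_480}

for n, halted in halting.items():
    total = machines_in_space(n)
    print(n, total, f"{halted / total:.4f}")
```

The printed fractions are 0.3333, 0.3044, 0.2852 and 0.2709; the value for n = 3 is 0.28517 to five decimals, which the text gives truncated as 0.2851.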

5.3. Runtimes investigation 

Runtimes much longer than the lengths of their respective halting
programs are rare, and the empirical distribution approaches the a priori
computable probability distribution on all possible runtimes predicted in
[3]. As reported there, "long" runtimes are effectively rare. The longer
a machine takes to halt, the less likely it is to stop.

Figure 5: Runtimes distribution in (4, 2).

Among the various miscellaneous facts from these results:

• All 1-bit strings were produced at t = 1.

• 2-bit strings were produced at all times 2 ≤ t ≤ 14.

• t = 3 was the first time at which strings of two different lengths
were produced (n = 2 and n = 3).

• Strings produced before 8 steps account for 49% of the strings produced
by all (4, 2) halting machines.

• There were 496 string groups produced by (4, 2), that is, strings that
are not symmetric under reversion or complementation.

• There is a relation between t and n: no n-bit string is produced at
a time t < n. This is obvious because a machine needs at least t steps
to print t symbols.

• At every time t there was at least one string of length n for 1 < n ≤ t.
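Some of the runtime facts above can be explored on the smaller spaces with a minimal Turing machine enumerator. The Python sketch below is not the authors' program. It assumes a formalism in which each (state, symbol) entry either halts while writing a symbol or writes, moves and changes state (4n + 2 choices per entry), that machines start in the first state on an all-0 tape, and that the output is the contents of the cells visited by the head. The function names are ours. The step bound 6 for n = 2 is the known Busy Beaver value S(2).

```python
from collections import Counter
from itertools import product


def all_machines(n):
    """Yield every transition table of an n-state, 2-symbol machine."""
    # Assumed formalism: ("halt", write) or (next_state, write, direction).
    instructions = [("halt", w) for w in (0, 1)]
    instructions += [(q, w, d) for q in range(n) for w in (0, 1) for d in (-1, 1)]
    entries = [(q, s) for q in range(n) for s in (0, 1)]
    for choice in product(instructions, repeat=len(entries)):
        yield dict(zip(entries, choice))


def run(machine, max_steps):
    """Run from state 0 on an all-0 tape; return (output, steps) or None."""
    tape, pos, state = {0: 0}, 0, 0
    for step in range(1, max_steps + 1):
        instruction = machine[(state, tape[pos])]
        if instruction[0] == "halt":
            tape[pos] = instruction[1]
            # Assumed output convention: contents of the visited cells.
            cells = range(min(tape), max(tape) + 1)
            return "".join(str(tape[i]) for i in cells), step
        next_state, write, move = instruction
        tape[pos] = write
        state = next_state
        pos += move
        tape.setdefault(pos, 0)  # newly visited cells are blank (0)
    return None  # did not halt within the step bound


def explore(n, max_steps):
    """Count outputs and record (output length, runtime) for halting machines."""
    outputs, runtimes = Counter(), []
    for machine in all_machines(n):
        result = run(machine, max_steps)
        if result is not None:
            output, steps = result
            outputs[output] += 1
            runtimes.append((len(output), steps))
    return outputs, runtimes


outputs, runtimes = explore(2, max_steps=6)  # S(2) = 6
print(sum(outputs.values()))                                # halting machines
print(all(length <= steps for length, steps in runtimes))  # n <= t always
```

Under these assumptions the enumeration of (2, 2) finds 3044 halting machines out of 10 000, the figure given in the halting summary, and every n-bit output is produced at a time t ≥ n. The same code runs for n = 3 with a step bound of 21, though more slowly; the (4, 2) space is far too large for this naive approach.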
Table 11: Probability that an n-bit string among all n < 10 bit strings is produced at times
t < 8.

Table 12: Probability that an n-bit string with n < 10 is produced at time t < 7.

6. Discussion

Intuitively, one may be persuaded to assign a lower or higher algorithmic 
complexity to some strings when looking at tables 9 and 10, because they 
may seem simpler or more random than others of the same length. We think 
that very short strings may appear to be more or less random but may be 
as hard to produce as others of the same length, because Turing machines 
producing them may require the same quantity of resources to print them 
out and halt as they would with others of the same (very short) length. 

For example, is 0101 more or less complex than 0011? Is 001 more or less 
complex than 010? The string 010 may seem simpler than 001 to us because 
we may picture it as part of a larger sequence of alternating bits, forgetting 
that such is not the case and that 010 actually was the result of a machine 
that produced it when entering into the halting state, using this extra state 
to somehow delimit the length of the string. No satisfactory argument may 
exist to say whether 010 is really more or less random than 001, other than 
actually running the machines and looking at their objective ranking accord- 
ing to the formalism and method described herein. The situation changes 
for larger strings, when an alternating string may in effect strongly suggest 
that it should be less random than other strings because a short description 
is possible in terms of the simple alternation of bits. Some strings may also 
assume their correct rank when the calculation is taken further, for example
if we were able to compute D(5).

On the other hand, it may seem odd that the program-size complexity of
a string of length l is systematically larger than l when the string can
be produced by a print function of length l + {the length of the print
program}, and indeed one can interpret the results exactly in this way.
The surplus can be interpreted as a constant product of a print phenomenon
which is particularly significant for short strings. But since it is
a constant, one can subtract it from all the strings. For example,
subtracting 1 from all values brings the complexity results for the
shortest strings to exactly their size, which is what one would expect
from the values for algorithmic complexity. On the other hand, subtracting
the constant preserves the relative order, even if larger strings continue
having algorithmic complexity values larger than their lengths. What we
provide herein, besides the numerical values, is a hierarchical structure
from which one can tell whether a string is of greater, lesser or equal
algorithmic complexity.

The print program assumes the implicit programming of the halting
configuration. In the C language, for example, this is delimited by the
semicolon. The fact, then, that a single-bit string requires a 2-bit
"program" may be interpreted as the additional information represented by
the length of the string; the fact that a string is of length n is not the
result of an arbitrary decision but is encoded in the producing machine.
In other words, the string not only carries the information of its n bits,
but also that of the delimitation of its length. This is different from,
for example, approaching algorithmic complexity by means of cellular
automata: there being no encoded halting state, one has to manually stop
the computation upon producing a string of a certain arbitrary length
according to an arbitrary stopping time. This is a research program that
we have explored before [2J] and that we may analyze in further detail
elsewhere.

It is important to point out that after the application of the coding
theorem one often gets a non-integer value when calculating C(s) from
m(s), even though, when interpreted as the size in bits of the program
produced by a Turing machine, it should be an integer value, because the
size of a program can only be given in an integer number of bits. The
non-integer values are, however, useful in providing a finer structure,
giving information on the exact places in which strings have been ranked.

An open question is how much of the relative string order (hence the
relative algorithmic probability and the relative algorithmic complexity)
of D(n) will be preserved when calculating D(i) for larger Turing machine
spaces such that 0 < n < i. As reported here, D(n) preserves most of the
string orders of D(n − 1) for 1 < n < 5. While each space (n, 2) contains
all (n − 1, 2) machines, the exponential increase in the number of machines
when adding states may easily produce strings such that the order of the
previous distribution is changed. What the results presented here show,
however, is that each new space of larger machines contributes in the same
proportion to the number of strings produced in the smaller spaces, in
such a way that they preserve much of the previous string order of the
distributions of smaller spaces, as shown by calculating the Spearman
coefficient, which indicates a very strong ranking correlation. In fact,
some of the ranking variability between the distributions of spaces of
machines with different numbers of states occurred later in the
classification, likely due to the fact that the smaller spaces missed the
production of some strings. For example, the first rank difference between
D(3) and D(4) occurred in place 20, meaning that the string order in D(3)
was strictly preserved in D(4) up to the top 20 strings sorted from higher
to lower frequency. Moreover, one may ask whether the actual frequency
values of the strings converge.
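The two ranking comparisons mentioned in this discussion, the Spearman coefficient and the place of the first rank difference, can be sketched as follows. This is a minimal Python illustration for two orderings of the same items without ties; the rankings used are hypothetical toy data, not the actual D(3) and D(4) classifications (which contain tied frequencies and would need a tie-aware variant), and the function names are ours.

```python
def spearman_rho(ranking_a, ranking_b):
    """Spearman's rank correlation for two orderings of the same items (no ties)."""
    n = len(ranking_a)
    position_b = {item: i for i, item in enumerate(ranking_b)}
    d_squared = sum((i - position_b[item]) ** 2 for i, item in enumerate(ranking_a))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


def first_rank_difference(ranking_a, ranking_b):
    """1-based place of the first disagreement, or None if the orders agree."""
    for place, (a, b) in enumerate(zip(ranking_a, ranking_b), start=1):
        if a != b:
            return place
    return None


# Hypothetical toy rankings: identical except for the last two items.
smaller_space = ["0", "1", "00", "01", "10", "11", "000", "111"]
larger_space = ["0", "1", "00", "01", "10", "11", "111", "000"]

print(round(spearman_rho(smaller_space, larger_space), 3))  # 0.976
print(first_rank_difference(smaller_space, larger_space))   # 7
```

A coefficient close to 1 together with a late first rank difference is the pattern described above for D(3) and D(4).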



7. Concluding remarks 

We have provided numerical tables with values of the algorithmic
complexity of short strings, and we have shed light on the behavior of
small Turing machines, particularly halting runtimes and output frequency
distributions. The calculation of D(n) provides an empirical and natural
distribution that does not depend on an additive constant and may be used
in several practical contexts. The approach, by way of algorithmic
probability, also reduces the impact of the additive constant, given that
one does not seem to be forced to make many arbitrary choices other than
fixing a standard model of computation (as opposed to fixing a specific
universal Turing machine). In other words, the approach is bottom-up
rather than top-down.

An interesting open question is how robust the produced complexity
classifications are to variations in the computational description
formalism, such as using Turing machines with one-directional tapes rather
than bi-directional ones, or following completely different models such as
n-dimensional cellular automata or Post tag systems. We have shown in [2J]
that reasonable formalisms seem to produce reasonable complexity
classifications, in the sense that: a) they are close to what intuition
tells us they should be, and b) they are statistically correlated with
each other at various degrees of confidence. This is, however, a topic of
current investigation.